At the pretrial conference on Wednesday, January 2, 2019, the Court heard oral argument on 
Defendants’ Motion in Limine to exclude the testimony of four witnesses whom plaintiffs disclosed for 
the first time on December 12, 2018, well past the discovery cut-off deadline. To explain their delay 
in disclosing these witnesses, counsel for plaintiffs asserted that they only recently became aware of 
the data-quality implications due to the inclusion of the citizenship question after the trial testimony 
of defendants’ expert witness, Dr. John Abowd, in the New York case. As the attached exhibits 
demonstrate, Dr. Abowd consistently has stated in his deposition testimony, his expert report, and 
in the memoranda he prepared that are contained in the administrative record, that there will be data 
quality issues due to the inclusion of the citizenship question. Therefore, this justification for the 
dilatory disclosure of the witnesses should be rejected. 


Date: January 3, 2019 Respectfully submitted, 

JOSEPH H. HUNT 
Assistant Attorney General 

BRETT A. SHUMATE 
Deputy Assistant Attorney General 

UNITED STATES DEPARTMENT OF COMMERCE 
Economics and Statistics Administration 
U.S. Census Bureau 

Washington, DC 20233-0001 


January 19, 2018 


MEMORANDUM FOR: Wilbur L. Ross, Jr. 
                Secretary of Commerce 

Through:        Karen Dunn Kelley 
                Performing the Non-Exclusive Functions and Duties of the Deputy 
                Secretary 

                Ron S. Jarmin 
                Performing the Non-Exclusive Functions and Duties of the Director 

                Enrique Lamas 
                Performing the Non-Exclusive Functions and Duties of the Deputy 
                Director 

From:           John M. Abowd 
                Chief Scientist and Associate Director for Research and Methodology 

Subject:        Technical Review of the Department of Justice Request to Add 
                Citizenship Question to the 2020 Census 


The Department of Justice has requested block-level citizen voting-age population estimates by OMB- 
approved race and ethnicity categories from the 2020 Census of Population and Housing. These estimates 
are currently provided in two related data products: the PL94-171 redistricting data, produced by April 1st 
of the year following a decennial census under the authority of 13 U.S.C. Section 141, and the Citizen 
Voting Age Population by Race and Ethnicity (CVAP) tables produced every February from the most 
recent five-year American Community Survey data. The PL94-171 data are released at the census block 
level. The CVAP data are released at the census block group level. 

We consider three alternatives in response to the request: (A) no change in data collection, (B) adding a 
citizenship question to the 2020 Census, and (C) obtaining citizenship status from administrative records 
for the whole 2020 Census population. 


We recommend either Alternative A or C. Alternative C best meets DoJ’s stated uses, is comparatively 
far less costly than Alternative B, does not increase response burden, and does not harm the quality of the 
census count. Alternative A is not very costly and also does not harm the quality of the census count. 
Alternative B better addresses DoJ’s stated uses than Alternative A. However, Alternative B is very 
costly, harms the quality of the census count, and would use substantially less accurate citizenship status 
data than are available from administrative sources. 


Case 3:18-cv-01865-RS Document 147-1 Filed 01/03/19 Page 3 of 59 


Summary of Alternatives 

Description 
  Alternative A: No change in data collection 
  Alternative B: Add citizenship question to the 2020 Census (i.e., the DoJ request), all 2020 
    Census microdata remain within the Census Bureau 
  Alternative C: Leave 2020 Census questionnaire as designed and add citizenship from 
    administrative records, all 2020 Census microdata and any linked citizenship data remain 
    within the Census Bureau 

Impact on 2020 Census 
  Alternative A: None 
  Alternative B: Major potential quality and cost disruptions 
  Alternative C: None 

Quality of Citizen Voting-Age Population Data 
  Alternative A: Status quo 
  Alternative B: Block-level data improved, but with serious quality issues remaining 
  Alternative C: Best option for block-level citizenship data, quality much improved 

Other Advantages 
  Alternative A: Lowest cost alternative 
  Alternative B: Direct measure of self-reported citizenship for the whole population 
  Alternative C: Administrative citizenship records more accurate than self-reports, incremental 
    cost is very likely to be less than $2M, USCIS data would permit record linkage for many 
    more legal resident noncitizens 

Shortcomings 
  Alternative A: Citizen voting-age population data remain the same or are improved by using 
    small-area modeling methods 
  Alternative B: Citizenship status is misreported at a very high rate for noncitizens, 
    citizenship status is missing at a high rate for citizens and noncitizens due to reduced 
    self-response and increased item nonresponse, nonresponse followup costs increase by at 
    least $27.5M, erroneous enumerations increase, whole-person census imputations increase 
  Alternative C: Citizenship variable integrated into 2020 Census microdata outside the 
    production system, Memorandum of Understanding with United States Citizen and Immigration 
    Services required to acquire most up-to-date naturalization data 


Approved: _ Date: _ 

John M. Abowd, Chief Scientist 

and Associate Director for Research and Methodology 

Detailed Analysis of Alternatives 

The statistics in this memorandum have been released by the Census Bureau Disclosure Review Board 
with approval number CBDRB-2018-CDAR-014. 

Alternative A: Make no changes 

Under this alternative, we would not change the current 2020 Census questionnaire nor the planned 
publications from the 2020 Census and the American Community Survey (ACS). Under this alternative, 
the PL94-171 redistricting data and the citizen voting-age population (CVAP) data would be released on 
the current schedule and with the current specifications. The redistricting and CVAP data are used by the 
Department of Justice to enforce the Voting Rights Act. They are also used by state redistricting offices to 
draw congressional and legislative districts that conform to constitutional equal-population and Voting 
Rights Act nondiscrimination requirements. Because the block-group-level CVAP tables have associated 
margins of error, their use in combination with the much more precise block-level census counts in the 
redistricting data requires sophisticated modeling. For these purposes, most analysts and the DoJ use 
statistical modeling methods to produce the block-level eligible voter data that become one of the inputs 
to their processes. 

If the DoJ requests the assistance of Census Bureau statistical experts in developing model-based 
statistical methods to better facilitate the DoJ’s uses of these data in performing its Voting Rights Act 
duties, a small team of Census Bureau experts similar in size and capabilities to the teams used to provide 
the Voting Rights Act Section 203 language determinations would be deployed. 

We estimate that this alternative would have no impact on the quality of the 2020 Census because there 
would be no change to any of the parameters underlying the Secretary’s revised life-cycle cost estimates. 
The estimated cost is about $350,000 because that is approximately the cost of resources that would be 
used to do the modeling for the DoJ. 

Alternative B: Add the question on citizenship to the 2020 Census questionnaire 

Under this alternative, we would add the ACS question on citizenship to the 2020 Census questionnaire 
and ISR instrument. We would then produce the block-level citizen voting-age population by race and 
ethnicity tables during the 2020 Census publication phase. 

Since the question is already asked on the American Community Survey, we would accept the cognitive 
research and questionnaire testing from the ACS instead of independently retesting the citizenship 
question. This means that the cost of preparing the new question would be minimal. We did not prepare 
an estimate of the impact of adding the citizenship question on the cost of reprogramming the Internet 
Self-Response (ISR) instrument, revising the Census Questionnaire Assistance (CQA), or redesigning the 
printed questionnaire because those components will not be finalized until after the March 2018 
submission of the final questions. Adding the citizenship question is similar in scope and cost to recasting 
the race and ethnicity questions again, should that become necessary, and would be done at the same time. 
After the 2020 Census ISR, CQA and printed questionnaire are in final form, adding the citizenship 
question would be much more expensive and would depend on exactly when the implementation decision 
was made during the production cycle. 

For these reasons, we analyzed Alternative B in terms of its adverse impact on the rate of voluntary 
cooperation via self-response, the resulting increase in nonresponse followup (NRFU), and the 
consequent effects on the quality of the self-reported citizenship data. Three distinct analyses support the 
conclusion of an adverse impact on self-response and, as a result, on the accuracy and quality of the 2020 
Census. We assess the costs of increased NRFU in light of the results of these analyses. 

B.1. Quality of citizenship responses 

We considered the quality of the citizenship responses on the ACS. In this analysis we estimated item 
nonresponse rates for the citizenship question on the ACS from 2013 through 2016. When item 
nonresponse occurs, the ACS edit and imputation modules are used to allocate an answer to replace the 
missing data item. This results in lower quality data because of the statistical errors in these allocation 
models. The analysis of the self-responses is done using ACS data from 2013-2016 because of 
operational changes in 2013, including the introduction of the ISR option and changes in the followup 
operations for mail-in questionnaires. 

In the period from 2013 to 2016, item nonresponse rates for the citizenship question on the mail-in 
questionnaires for non-Hispanic whites (NHW) ranged from 6.0% to 6.3%, non-Hispanic blacks (NHB) 
ranged from 12.0% to 12.6%, and Hispanics ranged from 11.6% to 12.3%. In that same period, the ISR item 
nonresponse rates for citizenship were greater than those for mail-in questionnaires. In 2013, the item 
nonresponse rates for the citizenship variable on the ISR instrument were NHW: 6.2%, NHB: 12.3% and 
Hispanic: 13.0%. By 2016 the rates increased for NHB and especially Hispanics. They were NHW: 6.2%, 
NHB: 13.1%, and Hispanic: 15.5% (a 2.5 percentage point increase). Whether the response is by mail-in 
questionnaire or ISR instrument, item nonresponse rates for the citizenship question are much greater than 
the comparable rates for other demographic variables like sex, birthdate/age, and race/ethnicity (data not 
shown). 
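
The 2013-to-2016 changes in the ISR rates quoted above can be checked with a short calculation. The following is only an illustrative sketch: the dictionary layout and the function name point_change are assumptions made for this example, and the rates are copied from the paragraph above rather than from the underlying ACS files.

```python
# ISR item nonresponse rates (percent) for the ACS citizenship question,
# copied from the text above for 2013 and 2016.
isr_rates = {
    "NHW": {2013: 6.2, 2016: 6.2},
    "NHB": {2013: 12.3, 2016: 13.1},
    "Hispanic": {2013: 13.0, 2016: 15.5},
}


def point_change(rates, start=2013, end=2016):
    """Return the percentage-point change per group, rounded to one decimal."""
    return {group: round(r[end] - r[start], 1) for group, r in rates.items()}


changes = point_change(isr_rates)
for group, delta in changes.items():
    print(f"{group}: {delta:+.1f} percentage points")
```

The Hispanic result reproduces the 2.5 percentage point increase stated in the text.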

B.2. Self-response rate analyses 

We directly compared the self-response rate in the 2000 Census for the short and long forms, separately 
for citizen and noncitizen households. In all cases, citizenship status of the individuals in the household 
was determined from administrative record sources, not from the response on the long form. A noncitizen 
household contains at least one noncitizen. Both citizen and noncitizen households have lower 
self-response rates on the long form compared to the short form; however, the decline in self-response for 
noncitizen households was 3.3 percentage points greater than the decline for citizen households. This 
analysis compared short and long form respondents, categories which were randomly assigned in the 
design of the 2000 Census. 

We compared the self-response rates for the same household address on the 2010 Census and the 2010 
American Community Survey, separately for citizen and noncitizen households. Again, all citizenship 
data were taken from administrative records, not the ACS, and noncitizen households contain at least one 
noncitizen resident. In this case, the randomization is over the selection of household addresses to receive 
the 2010 ACS. Because the ACS is an ongoing survey sampling fresh households each month, many of 
the residents of sampled households completed the 2010 ACS with the same reference address as they 
used for the 2010 Census. Once again, the self-response rates were lower in the ACS than in the 2010 
Census for both citizen and noncitizen households. In this 2010 comparison, moreover, the decline in 
self-response was 5.1 percentage points greater for noncitizen households than for citizen households. 

In both the 2000 and 2010 analyses, only the long-form or ACS questionnaire contained a citizenship 
question. Both the long form and the ACS questionnaires are more burdensome than the short form. 

Survey methodologists consider burden to include both the direct time costs of responding and the 
indirect costs arising from nonresponse due to perceived sensitivity of the topic. There are, consequently, 
many explanations for the lower self-response rates among all household types on these longer 
questionnaires. However, the only difference between citizen and noncitizen households in our studies 
was the presence of at least one noncitizen in noncitizen households. It is therefore a reasonable inference 
that a question on citizenship would lead to some decline in overall self-response because it would make 
the 2020 Census modestly more burdensome in the direct sense, and potentially much more burdensome 
in the indirect sense that it would lead to a larger decline in self-response for noncitizen households. 

B.3. Breakoff rate analysis 

We examined the response breakoff paradata for the 2016 ACS. We looked at all breakoff screens on the 
ISR instrument, and specifically at the breakoffs that occurred on the screens with the citizenship and 
related questions like place of birth and year of entry to the U.S. Breakoff paradata isolate the point in 
answering the questionnaire where a respondent discontinues entering data—breaks off—rather than 
finishing. A breakoff is different from failure to self-respond. The respondent started the survey and was 
prepared to provide the data on the Internet Self-Response instrument, but changed his or her mind during 
the interview. 

Hispanics and non-Hispanic non-whites (NHNW) have greater breakoff rates than non-Hispanic whites 
(NHW). In the 2016 ACS data, breakoffs were NHW: 9.5% of cases while NHNW: 14.1% and Hispanics: 
17.6%. The paradata show the question on which the breakoff occurred. Only 0.04% of NHW broke off 
on the citizenship question, whereas NHNW broke off 0.27% and Hispanics broke off 0.36%. There are 
three related questions on immigrant status on the ACS: citizenship, place of birth, and year of entry to 
the United States. Considering all three questions Hispanics broke off on 1.6% of all ISR cases, NHNW: 
1.2% and NHW: 0.5%. A breakoff on the ISR instrument can result in follow-up costs, imputation of 
missing data, or both. Because Hispanics and non-Hispanic non-whites break off much more often than 
non-Hispanic whites, especially on the citizenship-related questions, their survey response quality is 
differentially affected. 
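
One way to see the size of the differential is to express each group's breakoff rate as a multiple of the NHW rate. The sketch below is illustrative only: the grouping labels and the helper ratio_to_nhw are assumptions introduced here, and the percentages are those quoted in the paragraph above.

```python
# 2016 ACS ISR breakoff rates (percent of cases), as quoted in the text.
breakoff_pct = {
    "any screen": {"NHW": 9.5, "NHNW": 14.1, "Hispanic": 17.6},
    "citizenship question": {"NHW": 0.04, "NHNW": 0.27, "Hispanic": 0.36},
    "three immigration questions": {"NHW": 0.5, "NHNW": 1.2, "Hispanic": 1.6},
}


def ratio_to_nhw(rates):
    """Express each group's rate as a multiple of the NHW rate."""
    return {group: round(rate / rates["NHW"], 2) for group, rate in rates.items()}


ratios = {screen: ratio_to_nhw(rates) for screen, rates in breakoff_pct.items()}
for screen, by_group in ratios.items():
    print(screen, by_group)
```

On these figures the gap between groups is far wider on the citizenship screen than on the instrument as a whole.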

B.4. Cost analysis 

Lower self-response rates would raise the cost of conducting the 2020 Census. We discuss those increased 
costs below. They also reduce the quality of the resulting data. Lower self-response rates degrade data 
quality because data obtained from NRFU have greater erroneous enumeration and whole-person 
imputation rates. An erroneous enumeration means a census person enumeration that should not have 
been counted for any of several reasons, such as, that the person (1) is a duplicate of a correct 
enumeration; (2) is inappropriate (e.g., the person died before Census Day); or (3) is enumerated in the 
wrong location for the relevant tabulation (https://www.census.gov/coverage_measurement/definitions/). 
A whole-person census imputation is a census microdata record for a person for which all characteristics 
are imputed. 

Our analysis of the 2010 Census coverage errors (Census Coverage Measurement Estimation Report: 
Summary of Estimates of Coverage for Persons in the United States, Memo G-01) contains the relevant 
data. That study found that when the 2010 Census obtained a valid self-response (219 million persons), 
the correct enumeration rate was 97.3%, erroneous enumerations were 2.5%, and whole-person census 
imputations were 0.3%. All erroneous enumeration and whole-person imputation rates are much greater 
for responses collected in NRFU. The vast majority of NRFU responses to the 2010 Census (59 million 
persons) were collected in May. During that month, the rate of correct enumerations was only 90.2%, the 
rate of incorrect enumeration was 4.8%, and the rate of whole-person census imputations was 5.0%. June 
NRFU accounted for 15 million persons, of whom only 84.6% were correctly enumerated, with erroneous 
enumerations of 5.7%, and whole-person census imputations of 9.6%. (See Table 19 of 2010 Census 
Memorandum G-01. That table does not provide statistics for all NRFU cases in aggregate.) 

One reason that the erroneous enumeration and whole-person imputation rates are so much greater during 
NRFU is that the data are much more likely to be collected from a proxy rather than a household member, 
and, when they do come from a household member, that person has less accurate information than 
self-responders. The correct enumeration rate for NRFU household member interviews is 93.4% (see Table 21 
of 2010 Census Memorandum G-01), compared to 97.3% for non-NRFU households (see Table 19). The 
information for 21.0% of the persons whose data were collected during NRFU is based on proxy 
responses. For these 16 million persons, the correct enumeration rate is only 70.1%. Among proxy 
responses, erroneous enumerations are 6.7% and whole-person census imputations are 23.1% (see Table 21). 

Using these data, we can develop a cautious estimate of the data quality consequences of adding the 
citizenship question. We assume that citizens are unaffected by the change and that an additional 5.1% of 
households with at least one noncitizen go into NRFU because they do not self-respond. We expect about 
126 million occupied households in the 2020 Census. From the 2016 ACS, we estimate that 9.8% of all 
households contain at least one noncitizen. Combining these assumptions implies an additional 630,000 
households in NRFU. If the NRFU data for those households have the same quality as the average NRFU 
data in the 2010 Census, then the result would be 139,000 fewer correct enumerations, of which 46,000 
are additional erroneous enumerations and 93,000 are additional whole-person census imputations. This 
analysis assumes that, during the NRFU operations, a cooperative member of the household supplies data 
79.0% of the time and 21.0% receive proxy responses. If all of these new NRFU cases go to proxy 
responses instead, the result would be 432,000 fewer correct enumerations, of which 67,000 are erroneous 
enumerations and 365,000 are whole-person census imputations. 
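
The household figure in the paragraph above follows from multiplying its three stated assumptions. The sketch below reproduces only that step. The constant and function names are assumptions made for this example, and it does not attempt the enumeration-error counts, which depend on person-level inputs not given in this memorandum.

```python
# Inputs stated in the text above.
OCCUPIED_HOUSEHOLDS = 126_000_000   # expected occupied households, 2020 Census
NONCITIZEN_HOUSEHOLD_SHARE = 0.098  # households with at least one noncitizen (2016 ACS)
ADDED_NONRESPONSE = 0.051           # additional share of those households going to NRFU


def additional_nrfu_households(households, noncitizen_share, added_nonresponse):
    """Households with a noncitizen that move from self-response into NRFU."""
    return households * noncitizen_share * added_nonresponse


extra_households = additional_nrfu_households(
    OCCUPIED_HOUSEHOLDS, NONCITIZEN_HOUSEHOLD_SHARE, ADDED_NONRESPONSE
)
print(f"{extra_households:,.0f} additional households in NRFU")
```

The unrounded product is about 629,748 households, which the text rounds to 630,000.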

For Alternative B, our estimate of the incremental cost proceeds as follows. Using the analysis in the 
paragraph above, the estimated NRFU workload will increase by approximately 630,000 households, or 
approximately 0.5 percentage points. We currently estimate that for each percentage point increase in 
NRFU, the cost of the 2020 Census increases by approximately $55 million. Accordingly, the addition of 
a question on citizenship could increase the cost of the 2020 Census by at least $27.5 million. It is worth 
stressing that this cost estimate is a lower bound. Our estimate of $55 million for each percentage point 
increase in NRFU is based on an average of three visits per household. We expect that many more of 
these noncitizen households would receive six NRFU visits. 
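
The lower-bound cost figure can be reproduced from the numbers in the paragraph above. This is a minimal sketch under the stated assumptions; the function name nrfu_cost_increase is hypothetical, and the 126 million household base is taken from the data-quality estimate earlier in this section.

```python
def nrfu_cost_increase(extra_households, total_households, cost_per_point):
    """Return (NRFU workload increase in percentage points, implied cost in dollars)."""
    points = extra_households / total_households * 100
    return points, points * cost_per_point


# 630,000 additional NRFU households out of 126 million occupied households,
# at approximately $55 million per percentage point of NRFU workload.
points, cost = nrfu_cost_increase(630_000, 126_000_000, 55_000_000)
print(f"{points:.1f} percentage points -> ${cost / 1e6:.1f} million")
```

Because the $55 million figure assumes an average of three visits per household, the result is a floor, not a central estimate.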

We believe that $27.5 million is a conservative estimate because the other evidence cited in this report 
suggests that the differences between citizen and noncitizen response rates and data quality will be 
amplified during the 2020 Census compared to historical levels. Hence, the decrease in self-response for 
citizen households in 2020 could be much greater than the 5.1 percentage points we observed during the 
2010 Census. 

Alternative C: Use administrative data on citizenship instead of adding the question to the 2020 Census 

Under this alternative, we would add the capability to link an accurate, edited citizenship variable from 
administrative records to the final 2020 Census microdata files. We would then produce block-level tables 
of citizen voting age population by race and ethnicity during the publication phase of the 2020 Census 
using the enhanced 2020 Census microdata. 

The Census Bureau has conducted tests of its ability to link administrative data to supplement the 
decennial census and the ACS since the 1990s. Administrative record studies were performed for the 
1990, 2000 and 2010 Censuses. We discuss some of the implications of the 2010 study below. We have 
used administrative data extensively in the production of the economic censuses for decades. 
Administrative business data from multiple sources are a key component of the production Business 
Register, which provides the frames for the economic censuses, annual, quarterly, and monthly business 
surveys. Administrative business data are also directly tabulated in many of our products. 

In support of the 2020 Census, we moved the administrative data linking facility for households and 
individuals from research to production. This means that the ability to integrate administrative data at the 
record level is already part of the 2020 Census production environment. In addition, we began regularly 
ingesting and loading administrative data from the Social Security Administration, Internal Revenue 
Service and other federal and state sources into the 2020 Census data systems. In assessing the expected 
quality and cost of Alternative C, we assume the availability of these record linkage systems and the 
associated administrative data during the 2020 Census production cycle. 

C.1. Quality of administrative record versus self-report citizenship status 

We performed a detailed study of the responses to the citizenship question compared to the administrative 
record citizenship variable for the 2000 Census, 2010 ACS and 2016 ACS. These analyses confirm that 
the vast majority of citizens, as determined by reliable federal administrative records that require proof of 
citizenship, correctly report their status when asked a survey question. These analyses also demonstrate 
that when the administrative record source indicates an individual is not a citizen, the self-report is 
“citizen” for no less than 23.8% of the cases, and often more than 30%. 

For all of these analyses, we linked the Census Bureau’s enhanced version of the SSA Numident data 
using the production individual record linkage system to append an administrative citizenship variable to 
the relevant census and ACS microdata. The Numident data contain information on every person who has 
ever been issued a Social Security Number or an Individual Taxpayer Identification Number. Since 1972, 
SSA has required proof of citizenship or legal resident alien status from applicants. We use this verified 
citizenship status as our administrative citizenship variable. Because noncitizens must interact with SSA 
if they become naturalized citizens, these data reflect current citizenship status albeit with a lag for some 
noncitizens. 

For our analysis of the 2000 Census long-form data, we linked the 2002 version of the Census Numident 
data, which is the version closest to the April 1, 2000 Census date. For 92.3% of the 2000 Census 
long-form respondents, we successfully linked the administrative citizenship variable. The 7.7% of persons for 
whom the administrative data are missing is comparable to the item non-response for self-responders in 
the mail-in pre-ISR-option ACS. When the administrative data indicated that the 2000 Census respondent 
was a citizen, the self-response was citizen: 98.8%. For this same group, the long-form response was 
noncitizen: 0.9% and missing: 0.3%. By contrast, when the administrative data indicated that the 
respondent was not a citizen, the self-report was citizen: 29.9%, noncitizen: 66.4%, and missing: 3.7%. 
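
The 2000 Census comparison can be laid out as a small cross-tabulation of administrative status against self-report. The sketch below is illustrative: the dictionary structure and the helper disagreement_rate are assumptions made for this example, and the percentages are those quoted in the paragraph above.

```python
import math

# Self-reported status (percent) by administrative-record status, 2000 Census long form.
crosstab_2000 = {
    "citizen": {"citizen": 98.8, "noncitizen": 0.9, "missing": 0.3},
    "noncitizen": {"citizen": 29.9, "noncitizen": 66.4, "missing": 3.7},
}

# Each row should account for all linked respondents in that administrative category.
rows_sum_to_100 = all(
    math.isclose(sum(row.values()), 100.0, abs_tol=0.05)
    for row in crosstab_2000.values()
)


def disagreement_rate(admin_status):
    """Percent of self-reports giving the opposite status to the administrative record."""
    opposite = "noncitizen" if admin_status == "citizen" else "citizen"
    return crosstab_2000[admin_status][opposite]


print(rows_sum_to_100, disagreement_rate("citizen"), disagreement_rate("noncitizen"))
```

The asymmetry is the point of the comparison: disagreement is under 1% when the record says citizen and about 30% when it says noncitizen.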

In the same analysis of 2000 Census data, we consider three categories of individuals: the reference 
person (the individual who completed the census form for the household), relatives of the reference 
person, and individuals unrelated to the reference person. When the administrative data show that the 
individual is a citizen, the reference person, relatives of the reference person, and nonrelatives of the 
reference person have self-reported citizenship status of 98.7%, 98.9% and 97.2%, respectively. On the 
other hand, when the administrative data report that the individual was a noncitizen, the long-form 
response was citizen for 32.9% of the reference persons; that is, reference persons who are not citizens 
according to the administrative data self-report that they are not citizens in only 63.3% of the long-form 
responses. When they are reporting for a relative who is not a citizen according to the administrative data, 
reference persons list that individual as a citizen in 28.6% of the long-form responses. When they are 
reporting for a nonrelative who is not a citizen according to the administrative data, reference persons list 
that individual as a citizen in 20.4% of the long-form responses. 

We analyzed the 2010 and 2016 ACS citizenship responses using the same methodology. The 2010 ACS 
respondents were linked to the 2010 version of the Census Numident. The 2016 ACS respondents were 
linked to the 2016 Census Numident. In 2010, 8.5% of the respondents could not be linked, or had 
missing citizenship status on the administrative data. In 2016, 10.9% could not be linked or had missing 
administrative data. We reached the same conclusions using 2010 and 2016 ACS data with the following 
exceptions. When the administrative data report that the individual is a citizen, the self-response is citizen 
on 96.9% of the 2010 ACS questionnaires and 93.8% of the 2016 questionnaires. These lower 
self-reported citizenship rates are due to missing responses on the ACS, not misclassification. As we noted 
above, the item nonresponse rate for the citizenship question has been increasing. These item nonresponse 
data show that some citizens are not reporting their status on the ACS at all. In 2010 and 2016, 
individuals for whom the administrative data indicate noncitizen respond citizen in 32.7% and 34.7% of 
the ACS questionnaires, respectively. The rates of missing ACS citizenship response are also greater for 
individuals who are noncitizens in the administrative data (2010: 4.1%, 2016: 7.7%). The analysis of 
reference persons, relatives, and nonrelatives is qualitatively identical to the 2000 Census analysis. 

In all three analyses, the results for racial and ethnic groups and for voting age individuals are similar to 
the results for the whole population with one important exception. If the administrative data indicate that 
the person is a citizen, the self-report is citizen at a very high rate with the remainder being predominately 
missing self-reports for all groups. If the administrative data indicate noncitizen, the self-report is citizen 
at a very high rate (never less than 23.8% for any racial, ethnic or voting age group in any year we 
studied). The exception is the missing data rate for Hispanics, who are missing administrative data about 
twice as often as non-Hispanic blacks and three times as often as non-Hispanic whites. 

C.2. Analysis of coverage differences between administrative and survey citizenship data 

Our analysis suggests that the ACS and 2000 long form survey data have more complete coverage of 
citizenship than administrative record data, but the relative advantage of the survey data is diminishing. 
Citizenship status is missing for 10.9 percent of persons in the 2016 administrative records, and it is 
missing for 6.3 percent of persons in the 2016 ACS. This 4.6 percentage point gap between administrative 
and survey missing data rates is smaller than the gap in 2000 (6.9 percentage points) and 2010 (5.6 
percentage points). Incomplete (through November) pre-production ACS data indicate that citizenship 
item nonresponse has again increased in 2017. 
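
The narrowing gap described above can be verified directly from the quoted rates. In this sketch the variable names are assumptions made for the example. The 2016 gap is computed from the two missing-data rates in the text, while the 2000 and 2010 gaps are taken as stated because their component rates are not given in this section.

```python
# Missing citizenship status in 2016 (percent of persons), from the text above.
ADMIN_MISSING_2016 = 10.9
ACS_MISSING_2016 = 6.3

gap_2016 = round(ADMIN_MISSING_2016 - ACS_MISSING_2016, 1)

# Gaps in percentage points; 2000 and 2010 are quoted directly from the text.
gaps = {2000: 6.9, 2010: 5.6, 2016: gap_2016}

years = sorted(gaps)
narrowing = all(gaps[a] > gaps[b] for a, b in zip(years, years[1:]))
print(gaps, "narrowing:", narrowing)
```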

There is an important caveat to the conclusion that survey-based citizenship data are more complete than 
administrative records, albeit less so now than in 2000. The methods used to adjust the ACS weights for 
survey nonresponse and to allocate citizenship status for item nonresponse assume that the predicted 
answers of the sampled non-respondents are statistically the same as those of respondents. Our analysis 
casts serious doubt on this assumption, suggesting that those who do not respond to either the entire ACS 
or the citizenship question on the ACS are not statistically similar to those who do; in particular, their 
responses to the citizenship question would not be well-predicted by the answers of those who did 
respond. 

The consequences of missing citizenship data in the administrative records are asymmetric. In the Census 
Numident, citizenship data may be missing for older citizens who obtained SSNs before the 1972 
requirement to verify citizenship, naturalized citizens who have not confirmed their naturalization to SSA, 
and noncitizens who do not have an SSN or ITIN. All three of these shortcomings are addressed by 
adding data from the United States Citizen and Immigration Services (USCIS). Those data would 
complement the Census Numident data for older citizens and update those data for naturalized citizens. A 
less obvious, but equally important benefit, is that they would permit record linkage for legal resident 
aliens by allowing the construction of a supplementary record linkage master list for such people, who are 
only in scope for the Numident if they apply for and receive an SSN or ITIN. Consequently, the 
administrative records citizenship data would most likely have both more accurate citizen status and 
fewer missing individuals than would be the case for any survey-based collection method. Finally, having 
two sources of administrative citizenship data permits a detailed verification of the accuracy of those 
sources as well. 

C.3. Cost of administrative record data production 

For Alternative C, we estimate that the incremental cost, except for new MOUs, is $450,000. This cost 
estimate includes the time to develop an MOU with USCIS, estimated ingestion and curation costs for 
USCIS data, incremental costs of other administrative data already in use in the 2020 Census but for 
which continued acquisition is now a requirement, and staff time to do the required statistical work for 
integration of the administrative-data citizenship status onto the 2020 Census microdata. This cost 
estimate is necessarily incomplete because we have not had adequate time to develop a draft MOU with 
USCIS, which is a requirement for getting a firm delivery cost estimate from the agency. Acquisition 
costs for other administrative data acquired or proposed for the 2020 Census varied from zero to $1.5M. 
Thus the realistic range of cost estimates, including the cost of USCIS data, is between $500,000 and 
$2.0M. 
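
The range stated above can be checked with simple arithmetic. The following is a minimal sketch, not part of the memo: the function names are hypothetical, and rounding up to the next $100,000 is an assumption about how the rounded endpoints were obtained.

```python
def cost_range(base_cost, acquisition_low, acquisition_high):
    """Return (low, high) total cost: base estimate plus an acquisition-cost range."""
    return base_cost + acquisition_low, base_cost + acquisition_high


def round_up_to(amount, step=100_000):
    """Round a whole-dollar amount up to the next multiple of `step` (assumed rule)."""
    return -(-amount // step) * step


# Figures from the text: $450,000 incremental cost; acquisition costs from zero to $1.5M.
low, high = cost_range(450_000, 0, 1_500_000)
print(low, high)                            # unrounded endpoints
print(round_up_to(low), round_up_to(high))  # endpoints rounded up to the next $100,000
```

Under these assumptions the unrounded endpoints are $450,000 and $1,950,000, which round up to the $500,000 and $2.0M quoted in the text.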
Preliminary Analysis of Alternative D 

At the Secretary’s request we performed a preliminary analysis of combining Alternative B (asking the 
citizenship question of every household on the 2020 Census) and Alternative C (do not ask the question, 
link reliable administrative data on citizenship status instead) in the January 19, 2018 draft memo to the 
Department of Commerce into a new Alternative D. Here we discuss Alternative D, the weaknesses in 
Alternative C on its own, whether and how survey data could address these weaknesses, implications of 
including a citizenship question for using administrative data, and methodological challenges. 

Description of Alternative D: Administrative data from the Social Security Administration (SSA), 
Internal Revenue Service (IRS), U.S. Citizenship and Immigration Services (USCIS), and the State 
Department would be used to create a comprehensive statistical reference list of current U.S. citizens. 
Nevertheless, there will be some persons for whom no administrative data are available. To obtain 
citizenship information for this sub-population, a citizenship question would be added to the 2020 Census 
questionnaire. The combined administrative record and 2020 Census data would be used to produce 
baseline citizenship statistics by 2021. Any U.S. citizens appearing in administrative data after the version 
created for the 2020 Census would be added to the comprehensive statistical reference list. There would 
be no plan to include a citizenship question on future Decennial Censuses or American Community 
Surveys. The comprehensive statistical reference list, built from administrative records and augmented by 
the 2020 Census answers, would be used instead. The comprehensive statistical reference list would be 
kept current, gradually replacing almost all respondent-provided data with verified citizenship status data. 

What are the weaknesses in Alternative C? 

In the 2017 Numident (the latest available), 6.6 million persons born outside the U.S. have blank 
citizenship among those born in 1920 or later with no year of death. The evidence suggests that 
citizenship is not missing at random. Of those with missing citizenship in the Numident, a much higher 
share appears to be U.S. citizens than among those for whom citizenship data are not missing. 
Nevertheless, some of the blanks may be noncitizens, and it would thus be useful to have other sources 
for them. 

A second question about the Numident citizenship variable is how complete and timely its updates are for 
naturalizations. Naturalized citizens are instructed to immediately apply for a new SSN card. Those who 
wish to work have an incentive to do so quickly, since having an SSN card with U.S. citizenship will 
make it easier to pass the E-Verify process when applying for a job, and it will make them eligible for 
government programs. But we do not know what fraction of naturalized citizens actually notify the SSA, 
and how soon after being naturalized they do so. 

A third potential weakness of Numident citizenship is that some people are not required to have a Social 
Security Number (SSN), whether they are a U.S. citizen or not. It would also be useful to have a data 
source on citizenship that did not depend on the SSN application and tracking process inside SSA. This is 
why we proposed the MOU with the USCIS for naturalizations, and why we have now begun pursuing an 
MOU with the State Department for data on all citizens with passports. 
IRS Individual Taxpayer Identification Numbers (ITIN) partially fill the gap in Numident coverage of 
noncitizen U.S. residents. However, not all noncitizen residents without SSNs apply for ITINs. Only 
those making IRS tax filings apply for ITINs. Once again, it would be useful to have a data source that 
did not depend on the ITIN process. The USCIS and State Department MOUs would provide an 
alternative source in this context as well. 

U.S. Citizenship and Immigration Services (USCIS) data on naturalizations, lawful permanent residents, 
and I-539 non-immigrant visa extensions can partially address the weaknesses of the Numident. The 
USCIS data provide up-to-date information since 2001 (and possibly back to 1988, but with incomplete 
records prior to 2001). This will fill gaps for naturalized citizens, lawful permanent residents, and persons 
with extended visa applications without SSNs, as well as naturalized citizens who did not inform SSA 
about their naturalization. The data do not cover naturalizations occurring before 1988, nor 
some occurring between 1988 and 2000. USCIS data do not always cover children under 18 at the time a 
parent became a naturalized U.S. citizen. Such children automatically become U.S. citizens under the 
Child Citizenship Act of 2000. The USCIS receives notification of some, but not all, of these child 
naturalizations. Others inform the U.S. government of their U.S. citizenship status by applying for U.S. 
passports, which are less expensive than the application to notify the USCIS. USCIS visa applications list 
people’s children, but those data may not be in electronic form. 

U.S. passport data, available from the State Department, can help plug the gaps for child naturalizations, 
blanks on the Numident, and out-of-date citizenship information on the Numident for persons naturalized 
prior to 2001. Since U.S. citizens are not required to have a passport, however, these data will also have 
gaps in coverage. 

Remaining citizenship data gaps in Alternative C include the following categories: 

1. U.S. citizens from birth with no SSN or U.S. passport. They will not be processed by the 
production record linkage system used for the 2020 Census because their personally identifiable 
information won’t find a matching Protected Identification Key (PIK) in the Person Validation System 
(PVS). 

2. U.S. citizens from birth born outside the U.S., who do not have a U.S. passport, and either applied 
for an SSN prior to 1974 and were 18 or older, or applied before the age of 18 prior to 1978. These people 
will be found in PVS, but none of the administrative sources discussed above will reliably generate a U.S. 
citizenship variable. 

3. U.S. citizens who were naturalized prior to 2001 and did not inform SSA of their naturalization 
because they originally applied for an SSN after they were naturalized, and it was prior to when 
citizenship verification was required for those born outside the U.S. (1974). These people already had an 
SSN when they were naturalized and they didn’t inform SSA about the naturalization, or they didn’t 
apply for an SSN. The former group have inaccurate data on the Numident. The latter group will not be 
found in PVS. 

4. U.S. citizens who were automatically naturalized if they were under the age of 18 when their 
parents became naturalized in 2000 or later, and did not inform USCIS or receive a U.S. passport. Note 
that such persons would not be able to get an SSN with U.S. citizenship on the card without either a U.S. 
passport or a certificate from USCIS. These people will also not be found in the PVS. 
5. Lawful permanent residents (LPR) who received that status prior to 2001 and either do not have 
an SSN or applied for an SSN prior to when citizenship verification was required for those born outside 
the U.S. (1974). The former group will not be found in PVS. The latter group has inaccurate data in 
Numident. 


6. Noncitizen, non-LPR, residents who do not have an SSN or ITIN and who did not apply for a visa 
extension. These persons will not be found in PVS. 

7. Persons with citizenship information in administrative data, but the administrative and decennial 
census data cannot be linked due to missing or discrepant PII. 

Can survey data address the gaps in Alternative C? 

One might think that survey data could help fill the above gaps, either when their person record is not 
linked in the PVS, and thus they have no PIK, or when they have a PIK but the administrative data lack 
up-to-date citizenship information. Persons in Category 6, however, have a strong incentive to provide an 
incorrect answer, if they answer at all. A significant, but unknown, fraction of persons without PIKs are in 
Category 6. Distinguishing these people from the other categories of persons without PIKs is an inexact 
science because there is no feasible method of independently verifying their non-citizen status. Our 
comparison of ACS and Numident citizenship data suggests that a large fraction of LPRs provide 
incorrect survey responses. This suggests that survey-collected citizenship data may not be reliable for 
many of the people falling in the gaps in administrative data. This calls into question their ability to 
improve upon Alternative C. 

With Alternative C, and no direct survey response, the Census Bureau’s edit and imputation procedures 
would make an allocation based primarily on the high-quality administrative data. In the presence of a 
survey response, but without any linked administrative data for that person, the edit would only be 
triggered by blank citizenship. A survey response of “citizen” would be accepted as valid. There is no 
scientifically defensible method for rejecting a survey response in the absence of alternative data for that 
respondent. 

How might inclusion of a citizenship question on the questionnaire affect the measurement of citizenship 
with administrative data? Absent an in-house administrative data census, measuring citizenship with 
administrative data requires that persons in the Decennial Census be linked to the administrative data at 
the person level. The PVS system engineered into the 2020 Census does this using a very reliable 
technology. However, inclusion of a citizenship question on the 2020 Census questionnaire is very likely 
to reduce the self-response rate, pushing more households into Nonresponse Followup (NRFU). Not only 
will this likely lead to more incorrect enumerations, but it is also expected to increase the number of 
persons who cannot be linked to the administrative data because the NRFU PII is lower quality than the 
self-response data. In the 2010 Decennial Census, the percentage of NRFU persons who could be linked 
to administrative data was 81.6 percent, compared to 96.7 percent for mail responses. Those refusing 
to self-respond due to the citizenship question are particularly likely to refuse to respond in NRFU as 
well, resulting in a proxy response. The NRFU linkage rates were far lower for proxy responses than 
self-responses (33.8 percent vs. 93.0 percent, respectively). 
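
The effect on linkage described above can be illustrated as a weighted average. The following is a minimal sketch, not the Census Bureau's method: the function name and the shift fractions are hypothetical, and applying the overall mail linkage rate to the households that shift is a simplifying assumption. Only the two linkage rates come from the text.

```python
# Linkage rates reported in the text for the 2010 Decennial Census, as fractions.
MAIL_LINK_RATE = 0.967        # mail responses
NRFU_PROXY_LINK_RATE = 0.338  # NRFU proxy responses


def expected_link_rate(shift_to_proxy):
    """Expected linkage rate for a group of would-be mail responders when a
    fraction `shift_to_proxy` of them is instead enumerated by NRFU proxy."""
    if not 0.0 <= shift_to_proxy <= 1.0:
        raise ValueError("shift_to_proxy must be between 0 and 1")
    return ((1.0 - shift_to_proxy) * MAIL_LINK_RATE
            + shift_to_proxy * NRFU_PROXY_LINK_RATE)


# Hypothetical shift fractions, chosen only for illustration.
for shift in (0.0, 0.05, 0.10):
    print(f"{shift:.0%} shifted to proxy: {expected_link_rate(shift):.1%} linked")
```

Because the proxy linkage rate is so much lower than the mail rate, even a small shift lowers the expected linkage rate noticeably.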


Although persons in Category 6 will not be linked regardless of response mode, it is common for 
households to include persons with a variety of citizenship statuses. If the whole household does not 
self-respond to protect the members in Category 6, the record linkage problem will be further aggravated. 
Thus, not only are citizenship survey data of suspect quality for persons in the gaps for Alternative C, 
collecting these survey data would reduce the quality of the administrative records when used in 
Alternative D by lowering the record linkage rate for persons with administrative citizenship data. 

What methodological challenges are involved when combining these sources? 

Using the 2020 Census data only to fill in gaps for persons without administrative data on citizenship 
would raise questions about why 100 percent of respondents are being burdened by a citizenship question 
to obtain information for the two percent of respondents where it is missing. 

Including a citizenship question in the 2020 Census does not solve the problem of incomplete person 
linkages when producing citizenship statistics after 2020. Both the 2020 decennial record and the record 
with the person’s future location would need to be found in PVS to be used for future statistics. 

In sum, Alternative D would result in poorer quality citizenship data than Alternative C. It would still 
have all the negative cost and quality implications of Alternative B outlined in the draft January 19, 2018 
memo to the Department of Commerce. 
Expert Disclosure of John M. Abowd 
October 3, 2018 


I. Introduction 

Qualifications 

I am the Chief Scientist and Associate Director for Research and Methodology at the United States Census 
Bureau. I have served in that capacity since June 2016. My position is covered by an Intergovernmental 
Personnel Act (IPA) agreement between Cornell University and the Census Bureau. At Cornell, I am the 
Edmund Ezra Day professor of economics, professor of statistics and information science, and director of 
the Labor Dynamics Institute. 

In 1977, I received my Ph.D. in economics from the University of Chicago with specializations in 
econometrics and labor economics. My B.A. in economics is from the University of Notre Dame. 

I have been a university professor since 1976. My first appointment was assistant professor of economics 
at Princeton University. I was also assistant and associate professor of econometrics and industrial 
relations at the University of Chicago Graduate School of Business. In 1987, I was appointed associate 
professor of industrial and labor relations with indefinite tenure at Cornell University, where I am still 
employed. 

I am a member and fellow of the American Statistical Association, Econometric Society, and Society of 
Labor Economists (president 2014). I am an elected member of the International Statistical Institute. I am 
also a member of the American Economic Association, International Association for Official Statistics, 
National Association for Business Economists, American Association for Public Opinion Research, and 
American Association of Wine Economists. I regularly attend and present papers at the meetings of all of 
these organizations. 

I currently serve on the American Economic Association Committee on Economic Statistics. I have also 
served on the National Academy of Sciences Committee on National Statistics, the Conference on 
Research in Income and Wealth Executive Committee, and the Bureau of Labor Statistics Technical 
Advisory Board for the National Longitudinal Surveys (chair: 1999-2001). 

Relevant professional experience 

In 1998, the Census Bureau and Cornell University entered into the first of a sequence of IPAs and other 
contracts under which I served continuously as Distinguished Senior Research Fellow at the Census Bureau 
until I assumed my current position in 2016, under a new IPA contract. While I was a senior research 
fellow, I worked with numerous senior executives. This includes Directors (Martha Riche, Kenneth Prewitt, 
C. Louis Kincannon, Stephen Murdoch, Robert Groves, and John Thompson), Deputy Directors (Hermann 
Habermann, Thomas Mesenbourg, and Nancy Potok), Chief Scientists (Roderick Little and Thomas Louis), 
and numerous other associate directors, assistant directors, and division chiefs. I also worked with Chief 
Economists John Haltiwanger, J. Bradford Jensen, Daniel Weinberg, and Lucia Foster, and researchers in 
all program areas. 
I was one of three senior researchers who founded the Longitudinal Employer-Household Dynamics 
(LEHD) program at the Census Bureau. This program produces detailed public-use statistical data on the 
characteristics of workers and employers in local labor markets using large-scale linked administrative, 
census and survey data from many different sources. The program is acknowledged as the Census 
Bureau's first 21st Century data product: built to the specifications of local labor market specialists without 
additional survey burden, and published using state-of-the-art confidentiality protection. In addition to 
very substantial financial support from the Census Bureau, this project was supported by a $4.1 million 
grant from the National Science Foundation (NSF) on which I was the lead Principal Investigator. 

From 2004 through 2009, I was the lead Principal Investigator on the $3.3 million NSF-supported 
collaborative project with the Census Bureau to modernize secure access to confidential social science 
data. This project led to the first production implementation worldwide of differential privacy 1 for 
OnTheMap—a product of the LEHD program. It also produced prototype confidential data access systems 
with public-use synthetic micro-data supported by direct analysis of the confidential data on validation 
servers. These projects were the precursors to the Census Bureau's current program to implement central 
differential privacy for all publications from the 2020 Census of Population and Housing, which will be the 
first large-scale production implementation worldwide. 

From 2011 until I assumed my position as Chief Scientist at the Census Bureau in 2016, I was the Principal 
Investigator of the Cornell University node of the NSF-Census Research Network (NCRN), one of eight such 
nodes that worked collaboratively with the Census Bureau and other federal statistical agencies to identify 
important theoretical and applied research projects of direct programmatic importance to the agencies. 
The Cornell node produced the fundamental science explaining the distinct roles of statistical 
policymakers and computer scientists in the design and implementation of differential privacy systems at 
statistical agencies. 

I have published more than 100 scholarly books, monographs, and articles in the disciplines of economics, 
econometrics, statistics, computer science, and information science. I have been the principal investigator 
or co-principal investigator on 35 sponsored research projects. My full Curriculum Vitae is attached to this 
report. 

What I was asked to analyze 

I was asked to provide expert analysis in three areas: 

1. Is there credible quantitative evidence that the addition of a citizenship question on the 2020 
Census would affect the cost and quality of that census? 

2. Are the activities of the Census Bureau appropriate and adequate to address any cost and quality 
consequences that might arise during the conduct of the 2020 Census? 

3. Did the Census Bureau follow appropriate statistical quality standards when it placed the 
citizenship question from the American Community Survey onto the proposed questionnaire in 
the 2020 Census without further testing? 


1 Differential privacy is the leading privacy-enhancing data publication method in computer science. 
Key conclusions 

1. The Census Bureau produced credible quantitative evidence that the addition of a citizenship 
question to the 2020 Census could be expected to lower the self-response rate in an identifiable 
and large sub-population—households that may contain non-citizens. The lower self-response 
rate can be expected to increase Nonresponse Followup (NRFU) costs and lower the quality of 
census data other than the count itself. Therefore, the Census Bureau can and will make 
appropriate adjustments to various components of the 2020 Census, including NRFU and the 
Integrated Partnership and Communications Program to mitigate these effects. 

2. Neither the Census Bureau nor any external expert has produced credible quantitative evidence 
that the addition of a citizenship question to the 2020 Census would increase the net undercount 
or increase differential net undercounts for identifiable sub-populations. Therefore, there is no 
credible quantitative evidence that the addition of the citizenship question would affect the 
accuracy of the count. 

3. The citizenship question on the American Community Survey was thoroughly tested, most 
recently in 2006. Neither the Census Bureau's Quality Standards nor the Office of Management 
and Budget Statistical Policy Directives require further testing of this question before it can be 
used on the 2020 Census. If the OMB believes that further testing is necessary, it may request and 
provide clearance for such testing before issuing the clearance for the 2020 Census. 

II. Quantitative evidence on the effects of the citizenship question 

The purpose of the Decennial Census of Population and Housing is to conduct an actual enumeration of 
the population and disseminate the results to the President, the states, and the American people. The 
Census Bureau conducts the census in the 50 states, the District of Columbia, Puerto Rico, American 
Samoa, Guam, the Northern Mariana Islands, and the U.S. Virgin Islands. When conducting a decennial 
census, our goal is to count everyone once, only once, and in the right place. 

The 2020 Census has been in testing, development and implementation for almost a full decade. On 
December 12, 2017, the Department of Justice requested the addition of a question on citizenship for the 
purpose of producing block-level statistics on the citizen voting-age population in support of enforcement 
of the Voting Rights Act. On March 26, 2018, the Secretary of Commerce instructed the Census Bureau to 
add a question on citizenship to the 2020 Census. 

In the course of the deliberations and research that occurred at the Census Bureau between December 
15, 2017, when we were notified of the Department of Justice (DoJ) request, and the present, I supervised 
the preparation of a sequence of technical responses to the DoJ request (AR 1277-1285, 1308-1312) and 
the work of a team of researchers who subsequently released a technical working paper in August 2018 
(COM_DIS00009833-989). I will only summarize them here. 

First, at the time those memos and research papers were written, I was not aware of any randomized 
controlled trial (RCT) that provided credible quantitative information about the effects of the addition of 
a citizenship question on the net undercount in the decennial census. That is still the case. Randomized 
controlled trials are the gold standard for internal validity, and none exist that can address the potential 
consequences for net undercount (the coverage measure of choice for assessing the accuracy of a 
decennial census). Even if such an RCT had existed, there would remain the question of generalizability of 
its results. However, disagreement about the generalizability of an internally valid RCT estimate of an 
effect of the citizenship question on the net undercount should be a discussion based on specific evidence 
rather than an expert opinion based on accumulated experience. 

Second, the internal Census Bureau research relies on an alternative to RCTs, called a natural experiment 
or difference-in-difference estimator, to quantify the potential effect of a citizenship question on the unit 
self-response rate, the rate at which households voluntarily complete the census questionnaire and 
return it to the Census Bureau. The research statistically isolates a particular sub-population—households 
that contain at least one non-citizen or at least one person with unknown citizenship status—and 
compares it to a different sub-population—households that contain only citizens. The details of the way 
those sub-populations were isolated can be found in the technical paper. The salient result is that 
households containing at least one noncitizen or person of unknown citizenship status may be less likely 
to self-respond to the 2020 Census if it contains a question on citizenship. Putting the question on the 
census is therefore likely to depress self-response on average if the control group—households that 
contain all citizens—do not change their self-response rates. Because we must rely on a natural 
experiment, however, we have no evidence on control group behavior. That is because we cannot design 
the estimator to produce the quantity we seek to address (overall effect on self-response) and must work 
with the quantity we can estimate (the differential effect on self-response in the households with 
non-citizens compared to households containing only citizens). These estimates of the effect of the 
presence of a citizenship question on self-response rates are used in the next section to estimate the 
increased NRFU costs. 
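
The difference-in-difference logic described above can be sketched as follows. This is a generic illustration and not the estimator used in the technical paper: the function name is hypothetical and the self-response rates are invented for the example.

```python
def diff_in_diff(treated_with, treated_without, control_with, control_without):
    """Difference-in-difference estimate: the change in the treated group
    minus the change in the control group."""
    return (treated_with - treated_without) - (control_with - control_without)


# Hypothetical self-response rates (fractions), NOT taken from the technical paper.
# "with" / "without" refer to questionnaires with and without a citizenship question.
effect = diff_in_diff(
    treated_with=0.60, treated_without=0.70,   # households with a possible noncitizen
    control_with=0.72, control_without=0.77,   # households containing only citizens
)
print(f"Differential effect on self-response: {effect:+.3f}")
```

The result is the differential effect only. Reading it as the overall effect on self-response requires the additional assumption, noted in the text, that the control group's behavior does not change because of the question.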

It is important to stress that the estimated decrease in self-response rates does not translate into an 
increase in net undercount, and the use of our estimates as if they did is wholly inappropriate. Controlling 
net undercount depends critically on the Census Bureau's ability to fully enumerate the housing stock in 
the country, and then to determine which housing units are occupied, vacant, or nonexistent. Once a 
housing unit is known to be occupied, the quality of the data recorded for the occupants of that housing 
unit depends critically on self-response. Voluntary self-response produces much more accurate measures 
of the age, sex and other variables measured by the questionnaire. This is distinct from the process by 
which the Census Bureau ensures that it gets an accurate count in the NRFU operation (as measured by 
the net undercount statistics in the coverage evaluation program). 

Third, our research clearly showed that there is a serious issue regarding the accuracy of self-reported 
citizenship status. We did this by using record linkage methods to compare the answers on surveys to the 
citizenship status recorded in high quality administrative data. For individuals identified as citizens in the 
administrative data and who answer the citizenship question in the ACS, over 99 percent self-report that 
they are citizens. For individuals identified as noncitizens in the administrative data, a substantial minority 
(30 to 35 percent, depending on the year) report that they are citizens. 

Given the cost and data-quality concerns, the Census Bureau consistently recommended using 
administrative records rather than a citizenship question. However, this recommendation does not imply 
that asking the citizenship question will result in a less accurate count. We have no credible quantitative 
evidence to support that conclusion. 
III. Nonresponse followup consequences of the citizenship question 

Nonresponse followup (NRFU) is the largest of the decennial census field data collection operations. The 
primary purpose of NRFU is to conduct in-person contact attempts at each and every housing unit address 
that did not provide a response to the decennial census questionnaire using an online questionnaire, by 
returning a completed paper questionnaire, or by providing response information to a Census customer 
service representative over the telephone. We estimate, after providing approximately six weeks for 
individuals to respond, that the self-response rate will be 60.5 percent of all housing units. 2 This 
self-response rate estimate means that we also estimate that 39.5 percent of the housing unit addresses in 
the universe will not initially respond. 3 In NRFU, field representatives (known as enumerators) attempt 
to locate each nonresponding housing unit address, determine its status (occupied, vacant, non-existent), 
and for occupied housing unit addresses conduct an interview with a knowledgeable person who can 
provide responses to the decennial census questionnaire. 
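
The relationship between the self-response rate and the initial NRFU workload can be sketched as follows. This is an illustration only: the function names, the universe of 1,000,000 housing units, and the alternative self-response rates are hypothetical. Only the 60.5 percent estimate comes from the text.

```python
def nrfu_share(self_response_rate):
    """Share of housing unit addresses that do not initially respond."""
    if not 0.0 <= self_response_rate <= 1.0:
        raise ValueError("self_response_rate must be between 0 and 1")
    return 1.0 - self_response_rate


def nrfu_workload(housing_units, self_response_rate):
    """Approximate number of addresses entering NRFU, rounded to a whole unit."""
    return round(housing_units * nrfu_share(self_response_rate))


HYPOTHETICAL_UNIVERSE = 1_000_000  # illustrative universe size, not a Census figure

# 0.605 is the estimate in the text; the lower rates are hypothetical scenarios.
for rate in (0.605, 0.595, 0.585):
    print(f"self-response {rate:.1%}: "
          f"{nrfu_workload(HYPOTHETICAL_UNIVERSE, rate):,} addresses in NRFU")
```

Each percentage-point decline in self-response adds one percent of the address universe to the NRFU workload, which is why a lower self-response rate increases NRFU costs.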

The Census Bureau is prepared to conduct the 2020 Census NRFU operation and believes that those 
efforts will result in a complete enumeration. The Census Bureau has demonstrated the ability to 
successfully conduct a NRFU operation in previous censuses and in the 2018 End-to-End Census Test, the 
last field test prior to the 2020 Census. It has tested the operational design in evaluations over the course 
of the decade. The evaluations, along with historical data from past censuses and the American 
Community Survey, have informed the Census Bureau's operational design and the assumptions 
supporting that design. These evaluations have identified factors that could impact the operational 
implementation of NRFU. They have also provided evidence on the effects of an operational outcome 
such as lower than estimated self-response rates. 4 Contingency funding to handle deviations from the 
planned operations is built into the Life Cycle Cost Estimate (LCCE). The decision to include a question 
on citizenship has not impacted the NRFU operational design, but it will modify the execution of that 
design, if the self-response rate at the start of NRFU is below the estimate built into the LCCE. As 
documented in Section II, there is no evidence, to date, that the addition of the citizenship question will 
result in a less accurate enumeration. We are, however, prepared to react, adjust, and complete NRFU 
to ensure an accurate count and deliver the highest quality census data. 

Background 

To understand how the NRFU efforts work, one must first understand the basic methodology used for 
counting individuals for purposes of the decennial census. To conduct the census, the Census Bureau 
must consider all places where someone lives or could live as of April 1, 2020 (Census Day). We classify 
these places as one of two types of living quarters: housing units and group quarters. Living quarters are 
usually found in structures intended for residential use, but also may be found in structures intended for 
nonresidential use as well as in places such as tents, vans, hotels/motels, and emergency and transitional 
shelters. 


2 U.S. Census Bureau (2017d) page 15. 

3 Calculated value: 100 percent minus 60.5 percent. 

4 For example, preliminary analysis of the 2018 End-to-End Census Test suggests that shortfalls in recruiting NRFU 
enumerators can be partially or fully offset by efficiency gains from the Field Operational Control System. 
questions. 

BY MR. TILAK: 

Q Is natural experiment an accepted method 
of research in social science? 

A Yes. 

Q Do you believe a decline in self-response 
rates in households with at least one noncitizen 
will result in a higher undercount for noncitizen 
households? 

A We don't have any evidence to suggest 
that hypothesis is true. 

Q Do you have any evidence to suggest the 
hypothesis is false? 

A No. 

Q So you just don't know, one way or the 
other? 

A No. We think we know. We believe that 
the net undercount — I put the net before your 
undercount, but I'm putting it there 
exclusively — that the net undercount depends 
primarily on the energy and efficacy of the 
nonresponse follow-up efforts. That is a lesson 
learned over multiple censuses where we have 
beaten that net undercount down by perfecting 
processes that get at least some information about 
the house- — household — housing unit. 

The critical piece of information is how 
many people live there. So if you can determine 
that, then the rest of what's happening is you 
don't know anything about them, so the quality of 
the analyses you're going to do on anything other 
than the population count is affected. 

And truth in discussion, we know that 
there's a differential in that undercount. We 
make active efforts to abate that, and we have no 
evidence that the differential undercount is 
related to the presence of a citizen question, but 
it is related to the macroenvironment when you 
conduct the census. And that's not something you 
can do a randomized controlled trial on. 

Q Are you aware of any analysis or research 
looking at the relationship between self-response 
and undercount? 

A Again, net undercount, no.

that I might not agree with.

Q Let's turn to page 2 of the white paper. 
Bates COM_DIS09834. The last sentence of the 
abstract reads, "The evidence in this paper also 
suggests that adding a citizenship question to the 
2020 census would lead to lower self-response 
rates in households potentially containing 
non-citizens, resulting in higher field work costs 
and a lower quality population count." 

Did I read that accurately? 

A Yes, you did. 

Q Does the Census Bureau agree that the
balance of evidence available suggests that adding
a citizenship question to the 2020 census would
lead to lower self-response rates in households
potentially containing non-citizens?

A Yes.

Q Does the Census Bureau agree that the
balance of evidence available suggests that adding
a citizenship question to the 2020 census would
lead to a lower quality population count?

A I have to define lower quality population
count to answer that question. May I?

Q Yes, please. 

A So the usual accuracy measures are two:
Net undercount and then its components, gross
omissions and erroneous enumerations and 
whole-person census imputations. We have no 
evidence that it would affect the quality as 
regards net undercount. We have evidence that it 
would affect the count — the quality as regards 
components of the errors in the enumeration. 

Q We'll get back to that. Thank you for 

Could you turn to page 8 in the white 
paper, Bates number COM_DIS09840? And I want to 
look at figure 1, panel A. This graph shows item 
non-response, which is the failure to answer 
certain questions, on the American Community 
Survey, or ACS, in the year 2016, broken down by 
various racial and ethnic subgroups; is that 
correct?

A Racial, ethnic and demographic subgroups,
yes.

It does not mean net undercount.

THE REPORTER: Could you please repeat
your answer.

THE WITNESS: Accurate enumeration in
this sentence means enumeration errors and
whole-person census imputations. It does not mean
net undercount.

BY MR. HO: 

Q Now, if you send an in-person enumerator
to a household that doesn't self-respond and that
doesn't result in a response, one way that you 
could — another way you could have of enumerating 
that household is through a proxy response, which 
means trying to obtain a response from someone who 
is not a member of that household about that 
household, correct? 

A Yes.

Q And the Census Bureau agrees that proxy
enumeration generally results in lower quality
enumeration data than self-responses, correct?

A Yes.

Q And the Census Bureau agrees that a proxy
response is more likely to result in the omission
of a household member than a self-response,
correct?

A I haven't looked at the table recently, 
but I believe that's correct, yes. 

Q Let's go to the white paper again. And I 
want to look at page 12, Bates number 
COM_DIS09844, figure 3.

A Figure 3, did you say? 

Q I believe so. On page 12? 

A Okay. I thought I heard 4. 

Q Okay. Figure 3 depicts unit non-response
to the ACS from 2010 through 2016 comparing census
tracts with the lowest decile of housing units 
containing a non-citizen to the census tracts in 
the highest decile of housing units containing a 
non-citizen, correct? 

A Correct. 

Q And for each year of ACS depicted here, 
census tracts in the highest decile of housing 
units containing a non-citizen have a lower 
response rate to the ACS than do census tracts in

Q You mentioned differences between NRFU
efforts in 2020 versus NRFU efforts in 2010. 

Could you elaborate on what those differences are 
anticipated to be? 

A The major differences in the NRFU from 
2020 as compared to 2010 are the extensive use of 
administrative records at both the determination 
of occupied, vacant, delete, and potentially for 
enumeration after the first non-response 
follow-up.

Q Any other differences in the NRFU efforts 
planned for 2020 versus 2010 other than the use of 
administrative records for enumeration purposes? 

A The field operations are controlled by a 
field operational control system that contains a 
very extensive route optimizer that we tested all 
decade. 

(Discussion held off the record.) 

BY MR. HO: 

Q Backing up for a moment, Dr. Abowd, does 
the Census Bureau believe that it is reasonable to 
be spending the increased amounts of money that it 
will be forced to spend, and staff time, due to
the citizenship question being included on the 
decennial questionnaire given the utility of the 
data that will be on it? 

A The Census Bureau has been instructed to 
include a citizenship question on the 2020 census 
and has attempted to quantify the consequences of 
that for the operations of the 2020 census. That 
quantification suggests increases in the 
non-response follow-up costs and a deterioration
in the quality of the response data. And we are
prepared to conduct the census with those extra 
resources in NRFU and taking account of the change 
in the quality of the data. 

Q Dr. Abowd, you testified that one of the 
reasons why the Census Bureau rejected the RCT 
proposal is that it didn't make sense from a 
cost-benefit perspective, correct, in the view of 
the Census Bureau? 

A Correct. 

Q In the view of the Census Bureau, does it
make sense from a cost-benefit perspective to add

enter into the computation of net undercount in a
very complicated way, and as a consequence, I have 
not been able to make a reasonable, credible, 
quantitative evidence of the effect on undercount. 

I accept that it's possible that the 
undercount will go up. I accept that it's 
possible that the undercount will go down. I 
provided evidence that components of the 
undercount will change, but those components don't 
enter into the calculation of the undercount with 
the same sign. So when one of them changes, you 
have to also compensate by calculating the changes 
in the others. And if you don't have magnitude 
estimates, you can't get an estimate for the net 
undercount effect. 

So -- 

Q. Thank you. We'll — 

A. So you can't say it's likely without that 
qualitative evidence. It's certainly possible. 

And the quality degradation associated with the 
components is documented and was sufficient for 
us, as a bureau, to recommend that the question 
not be asked. But we don't have quantitative
evidence that the undercount will likely increase. 

Q. Thank you. We'll be spending some time 
most likely this afternoon going through some of 
that in more detail. 

Why don't you continue with 
Dr. Thompson's report. 

A. In the following paragraph, I don't 
take — I don't take issue with his summary of the 
facts or the evidence that's put forward. He 
concludes, "These facts all strongly suggest that 
NRFU efforts may be unsuccessful with respect to 
households that decline to answer the decennial 
census questionnaire because of the citizenship 
question, particularly non-citizen households." 

I've hash-tagged that number 4. 

I'm going to have to presume that 
Mr. Thompson means by "unsuccessful" that the same 
measures that we would use inside the Census 
Bureau as unsuccessful, which means that the 
address in the MAF file passes all the way to
count imputation. 


Veritext Legal Solutions 

215 - 241-1000 ~ 610 - 434-8588 ~ 302 - 571-0510 ~ 202 - 803-8830 











Case 3:18-cv-01865-RS Document [number illegible] Filed 01/03/19 Page 39 of 59


Page 42 

If he means by that conclusion that the 
quality of the data produced in NRFU is not 
comparable to the quality of the data produced in 
self-response, I completely agree. If he means 
that more cases will pass to count imputation, I 
accept that that's a possibility. But at the 
moment, the fraction of cases passing to count 
imputation is expected to be very small, and I'm 
prepared to discuss why we think it will be very 
small with or without the citizenship question on 
the 2020 census. 

Q. Okay. Thank you. Why don't you proceed. 

A. That's all. 

Q. So beyond the four points you identified, 
you're not planning to express any opinions 
critical of Dr. Thompson's report at trial, 
correct?

A. That's correct. 

Q. Okay. Have you reviewed the — you can 
set 3 aside.

Have you reviewed the September 7th 
report of Dr. Hermann Habermann?

and identify any points of disagreement.

A. Okay. In his rebuttal report, 
paragraph 3 — do you want me to use a separate 
numbering set now? Start with 1 again? 

Q. Why don't we start with 1 again. And 
just so the record is clear — this was clear 
before the break, but you're marking Exhibit 6. 

A. I am marking Exhibit 6, the expert
rebuttal report of Matthew A. Barreto, Ph.D. And
I have just marked paragraph 3 with my number 1.

I disagree with the entire paragraph. I
offered specific quantitative evidence. I
demonstrated the adequacy of the testing of the
citizenship question.

Q. Okay. If you could continue. 

A. I disagree with paragraph number 4. I'm
marking it number 2. It says that I haven't said
why I reject his arguments, but I have now.

Do you want me to repeat those? 

Q. No. I think the question is — the 
observation is whether you've critiqued them in 
your report.

A. Oh, I can — I acknowledge that I did not
critique them in my report. 

Q. Okay. 

A. Do you want me to cross that out? 

Q. It's fine. The record is clear this way. 

A. I didn't realize that the paragraph 
continued on the next page, so I may want to make 
specific comments about it. 

Yes, I'm going to make the specific 
comment that my expert report, and many of the
other expert reports in this case, acknowledge the
data quality issues associated with lower 
self-response. And I believe that my expert 
report provides credible quantitative evidence 
that lower self-response is something that we 
should expect. 

There are specific consequences of lower 
self-response that can be quantified, and I 
believe that I've quantified them. The net 
undercount — I know we're going to get to this, 
but the net undercount has not been quantified, 
either by Dr. Barreto or me. And when I say no 
credible evidence that the citizenship question
will have a bearing on the net undercount has been 
entered, that's what I mean. 

I object to paragraph 5. I'm marking it 
with number 3. I have acknowledged that missing 
data imputation with respect to characteristics is 
less accurate in non-response follow-up. I have 
not produced, and Dr. Barreto has not produced, 
any quantitative evidence to show that that 
applies — credible quantitative evidence to show 
that that applies to count imputation, nor to show 
that more cases will get to count imputation. 

I disagree with paragraph 7. I'm 
labeling it number 4 with my initials. 

What my expert report asserts is that the 
design of the NRFU operation and the budgetary 
envelope that it's going to be conducted under are 
sufficient to deliver data of the accuracy that 
that non-response follow-up operation was designed 
to produce. That's not the same thing as saying 
that they will produce data of accuracy that the 
parameters that were in place before the insertion

Why don't we continue with Dr. Barreto.

A. Okay. I disagree with paragraph 12. I'm 
labeling it — 

Q. You're up to number 8. 

A. -- 8. 

As far as I'm aware, these studies do not 
show a systematic bias in net undercount 
identified from their study of particular family 
situations. 

I accept the conclusion that these 
neighborhoods are difficult — more difficult to 
get characteristic data in. 

I disagree with paragraph 13. I'm 
labeling it number 9. It's a repeat of his 
disagreement with my claim that there's no 
credible evidence that the net undercount will go 
up. And I've already defended that. So I 
disagree with this paragraph for the same reason 
that I've already stated. 

I disagree with paragraph 14. I'm 
labeling it number 10. I don't believe that the 
RCT that was embedded in the survey that 
Dr. Barreto ran is strong enough to meet the
randomized control trial criterion that we would
normally impose. However, I accept the evidence
that self-response rates in that survey were lower
for the group that got the treatment. Professed
self-response rates were lower.

So I don't think we have an argument over
whether that could be used to get some evidence
about self-response rates. We have an argument
over the subsequent conclusions. 

I disagree with paragraph 15. I'm 
labeling it number 11. The reasons are now 
getting repetitive, but basically I don't think he 
properly analyzed the relationship of his survey 
to the non-response follow-up operation on the 
2020 census. And the experts in the case, 
including Dr. Barreto, are in agreement about the 
degradation in the quality of the characteristics. 
We appear to be primarily in disagreement about
whether you can make a quantitative inference
about the net undercount. And I continue to
disagree with his inferences about the net
undercount.

I feel like I need to disagree with 
paragraph 16, so I'm labeling it number 12. The 
research cited in paragraph 16 stems from the 
extensive debate in the '90s and early 2000s about 
whether to use dual system estimation to actually 
adjust the population counts following a census. 
The Census Bureau designed the 2010 non-response 
follow — sorry, the 2010 coverage measurement 
system only to learn about the quality of the 
census and not to learn with sufficient precision 
things that could be adjusted. 

So I believe it's fair to say that the 
census — well, so I — as an expert, I accept 
that dual system estimation would be problematic 
for adjusting population counts. And at the 
Census Bureau, we do not adjust the population 
estimates from one census to another census even 
though we have the dual system estimator that 
would allow us to do that for a variety of 
reasons. 

But we do use dual system estimation as 
one tool, one very important tool, to try to
assess the quality of the census count. We don't 
got — draw large enough samples any longer to go 
into the kind of detail that the research in the 
'90s and the early 2000s did in the ACE program — 
I'm sorry, I can't expand the acronym anymore. 

The coverage survey that accompanied the 2000 
census was 750,000 households, not 170,000 
households. 

Q. Okay. Thank you. 

A. So I disagree with paragraph 17. I'm 
marking it number 13. The official net undercount 
for 2010 was a minuscule overcount that was 
swamped by its standard error. And so it was not 
an undercount. So that's just factually wrong. 

As regards differential undercounts, we 
did have a differential undercount for some of the 
populations noted in this paragraph, but it was 
smaller in 2010 than it was in 2000. 

Q. I'm sorry, it was smaller in 2010 than in
2000?

A. Yes.

I disagree with paragraph 14 — sorry,
paragraph 18. I'm labeling it number 14. 

I did present quantitative evidence. I 
distinguished carefully between net undercount and 
other measures of census quality. And my report 
documents those quantitative effects that I think 
can be documented. 

I disagree with paragraph 26. 

Q. Number 15. 

A. Number 15. 

Q. If you could mark that number 15. 

A. I acknowledge that the report does 
indicate how the operation of NRFU would work, and 
that does involve lots of moving of enumerators 
around in response to actual as opposed to 
predicted self-response. But I also said that the 
integrated partnership and communication program 
would work with the partners and is already hiring 
partners to work with in order to establish with 
the trusted voices how to message that 
participation in the census is still very 
important and the community itself will be harmed

say, this is the paragraph I'm relying on when I
criticize Dr. Barreto. Is that what you want me 
to do? 

Q. I'm not asking for you to say — I mean, 
obviously, on the testing point, we understand 
your position on testing, and we'll spend some 
time talking about that as well. 

I'm trying to understand from your report 
what one could glean your criticisms of 
Dr. Barreto are. If it's in the report. If it's 
not in your report, then, you know, we can discuss 
that.

A. So my report deals with the specific ways 
in which we intend to mitigate the consequences of 
placing the citizenship question on the 
2020 census. My principal criticism of 
Dr. Barreto's report was that he misinterpreted 
many of the things that I said. And when they 
were interpreted properly and in the context of 
what I wrote, they support many of the points that 
he was trying to make, but they don't support a 
conclusion about net undercount.

I feel that the failure to distinguish
between the components of the quality of the 
census and a quantitative estimate of the effect 
on net undercount is an important expert point. 

And I defend in the report the position that we 
did document that the question itself was going to 
cause difficulties in conducting the census. 

We weren't — we weren't asked to design 
a 2020 census that could produce a CVAP table and 
make the decision ourself about whether the 
question would be on the census. 

From an expert point of view, the 
quantitative question of where do the effects show 
up are addressed in my report. 

The conclusion that that means 
necessarily that we expect a larger net undercount 
is unsupported by the data. It does mean that we 
are going to have more difficulty estimating the 
count. There's no question about that. And that 
we're going to have much less reliable data to the 
extent that we don't get self-responses on some 
portions of the population that don't cooperate. 
These are distinct quality measures that
we have -- I have consistently,
both in fact and in expert testimony, identified
and quantified.


The fact that they can't be used directly to 
produce a net undercount estimate hasn't affected 
my opinion about whether the question should be on 
the census. It hasn't affected the Census 
Bureau's recommendation about the question. 

It seems to me to be something that is 
undocumented by the plaintiffs' experts and that I 
specifically call out as undocumented in my expert 
report. 

Q. So I — I understand that. And we'll 
spend some time going through that language and — 
just to make sure we understand exactly what your 
view is. 

You had a range, though, of criticisms of 
Dr. Barreto, for example, in his main report about 
his survey methodology and — 

A. Right. 

Q. — survey design. Is any of that in your 
report?

A. Oh, I see, okay.

Q. I am cheating. 

What — can you just explain that? What 
is the difference and why is it an important 
difference? 

A. So my expert report attempts to document 
why we believe that it's possible — likely that 
we can conduct a non-response follow-up operation 
as designed and get responses for the vast 
majority of the addresses. 

We consider that operation to be 
successful for the purposes of the actual 
enumeration account if we have a small 
percentage — very small percentage — that go to 
count imputation. And we use count imputation to 
do the rest. Count imputation is the best 
available method for correcting what should be a 
very small percentage of the addresses that we 
can't resolve.

Q. For example, what was — 

A. The rest — 

Q. Sorry.

A. The rest of that nuanced answer was
basically saying, if you go into non-response 
follow-up with less self-response, you get lower 
quality data. And I don't think anyone is arguing 
about that. 

But we've had declining self-response for 
decades, and we've had the opportunity to refine 
the non-response follow-up system so that it could 
ultimately get to the constitutional objective of 
the census, which is to get an actual enumeration 
within the resources that we are given to do that. 

And so the question of mitigating the 
data quality consequences of a citizenship 
question has two parts. It's a mitigation if you 
can use the operation of the NRFU as you designed 
it and still accomplish its purpose. And that's 
what my expert report is designed to inform. 

That's — its ultimate purpose is to get that 
constitutional enumeration as accurate as we can 
get it.

There are many, many other uses for the 
census. They depend much more critically on the 
quality of the data, and they will
be affected. And I've testified to that as an
expert and as a fact witness. And I don't believe 
my expert report or the other expert reports 
challenges that. 

Q. That's fair. So just in terms of helping 
us understand, so before the citizenship question 
was adopted, was there a target or a parameter for 
the percentage of NRFU that would ultimately go to 
enumeration that would constitute a successful 
NRFU operation? I'm sorry, I said enumeration, 
but I meant imputation. Was there a set 
percentage or a goal or a target for imputation 
that would define a successful NRFU operation? 

A. So — I'm pausing because I don't think 
anything in the operational plan or the life cycle 
cost estimate depends upon the number of 
households that get to count imputation. The 
closest is the estimated workload that goes to 6. 
And I do — in my report — if you'll just give me 
a second — I think I — I think I have that in 
here.

increases the self — that lowers the
self-response rate and then increases the
non-response follow-up identifies the operations 
required in order to increase the — in order to 
get responses from the increased workload that 
that will involve. 

So that is not a pious hope. That is 
saying if the objective is to get through the 
entire master address file and resolve the 
occupancy status for each of those addresses and 
then get a questionnaire for occupied households, 
then we have demonstrated why we — I'm sorry. 

I believe that the NRFU process in place 
for the Census Bureau is designed to do that and 
can be operated effectively to do that. That is 
evidence, because the goal of that process is to 
resolve the occupancy status — the first goal in 
that process is to resolve the occupancy status of 
each of the addresses. The second goal is to get 
a questionnaire. 

The overall census goal, count every
person once, only once, and in the right place,
depends critically on the first goal of resolving
the occupancy status and getting a count inside 
the occupied households. 

The remaining statutory and social uses 
of the data depend critically on the quality of 
the responses that we get. And our ability to 
evaluate the quality of those responses also 
depends on getting data on those households. 

There's no disagreement that those data 
are not going to be as good. But all the way to 
the count, if you can establish that the design of 
the NRFU should be successful in resolving the 
count status of every address, then to the 
enumeration question, the net undercount, there is 
evidence. It's — there's no evidence — let me 
put this differently. 

The effects that are documented in my 
expert report are intended to show that it's going 
to be more expensive and produce lower quality
data. But we couldn't find a quantitative effect
from that process on the net undercount. That's 
not a pious hope. That is scientific evidence 
from the design of the NRFU.

Q. Okay. Thank you. 

A. Page 16. The single-sentence paragraph 
that begins, "Whatever method used, imputation 
further systematically disadvantages hard-to-count 
subpopulations, in particular, non-citizens and 
households containing non-citizens." 

Q. You're up to number 5. 

A. Number 5, yeah. I don't accept that 
there's been any quantitative evidence presented 
that that conclusion applies to count imputation. 

Q. Is there any evidence, period? You said 
quantitative — 

A. Well, I'll acknowledge that 



imputation has those problems. 


Q. With regard to count imputation, is there 
any evidence, even qualitative evidence, that 
there's a difference? 

A. Well — yes, there's qualitative 
evidence, ethnographic case studies and other 
follow-ups that the Census Bureau and other 
demographers have conducted. I acknowledge that.

THE WITNESS: — was not significantly
different — I'm just reading, I'm sorry. "The
Census Bureau's coverage evaluation from the 19 — 
from the 2010 census showed that net undercount 
for New York City overall, and for each of the 
five boroughs, was not significantly different 
from zero — no undercount, in other words (the 
same was true for 2000). However, when we look at
the components of net undercount, we see a 
different perspective on the census error." 

I agree with that conclusion. We use 
slightly different implementations of the dual 
system estimator. He has already subtracted off 
whole-person census imputations, and the Census 
Bureau leaves them as a separate component, but 
they're not measured with error because they're 
identified on each census record, so there's no 
sampling error associated with them. 

So — so in his world, omissions — or 
net undercount is omissions minus erroneous 
enumerations. And in my world, net undercount is 
omissions minus erroneous enumerations minus 
whole-person census imputations. And in his
world, the target is the data-defined census 
count. And in my world, the target is the actual 
census count. 

Those are very subtle differences in the 
way you use net undercount statistics. But 
there's no disagreement among the experts in this 
case that the components are all — are going to
move around. And that is what I meant when I
wrote, in my official capacity, degradation of the
quality of the census data. But he's also quite
aware that they can cancel out. And the point 
he's made about New York City here shows that they 
do cancel out in some cases. They don't always 
cancel out. I'm not saying they always cancel 
out. I am saying that you really have to do the 
whole analysis with the quantifiers in order to 
say what the effect of the net undercount is. 

As to the components, I accept that. 
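
The two usages of net undercount described in the answer above can be sketched in a few lines of Python. This is only an illustrative sketch of the arithmetic as the witness states it. The function names and every component count below are hypothetical and do not come from the record or from any Census Bureau estimate. The sketch shows why a change in the components does not by itself fix the direction of the net figure: the components enter with different signs, so magnitudes are needed.

```python
def net_undercount_bureau(omissions, erroneous_enumerations, whole_person_imputations):
    """Net undercount as the witness describes his own usage:
    omissions minus erroneous enumerations minus whole-person imputations."""
    return omissions - erroneous_enumerations - whole_person_imputations


def net_undercount_data_defined(omissions, erroneous_enumerations):
    """Net undercount as the witness describes the other expert's usage,
    where whole-person imputations have already been subtracted off."""
    return omissions - erroneous_enumerations


# Hypothetical component counts, chosen only to illustrate cancellation.
baseline = dict(omissions=1000, erroneous_enumerations=700, whole_person_imputations=300)
# Every component grows, yet the changes offset one another.
offsetting = dict(omissions=1400, erroneous_enumerations=900, whole_person_imputations=500)
# Every component grows, but omissions grow by more than the other two combined.
not_offsetting = dict(omissions=1400, erroneous_enumerations=800, whole_person_imputations=400)

baseline_net = net_undercount_bureau(**baseline)
offsetting_net = net_undercount_bureau(**offsetting)
not_offsetting_net = net_undercount_bureau(**not_offsetting)

print(baseline_net, offsetting_net, not_offsetting_net)
```

In the first two hypothetical scenarios the components differ substantially while the net figure is unchanged. In the third the net figure moves. Which case applies depends on the magnitudes, which is the point the testimony makes about needing quantitative estimates for each component.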

Q. Thank you. If you could continue. 

A. Okay. I think the right place to mark it 
is on page 15, the last partial paragraph and as 


Case 3:18-cv-01865-RS Document [number illegible] Filed 01/03/19 Page 59 of 59


Page 123 

Q. Thank you. Please continue. 

A. On page 20, the conclusion — I think 
this is just a restating of the conclusion that I 
called out on — in the initial bullets. But the 
paragraph that begins, "The Abowd report." 

And am I at 5 or 6? 

Q. You're at, I think, actually, 4. 

A. All right. I'll number it 4. 

I believe that my report and my other
testimony fully acknowledge the data quality
consequences of having to use more NRFU.

Q. Okay. Any — do you have any other 
criticisms of — 

A. No.

Q. — Mr. Salvo's report? 

A. No.

MR. FREEDMAN: We will break for lunch. 

BY MR. FREEDMAN: 

Q. Before we do, just for estimating 
purposes, have you reviewed the reports of Bernard 
Fraga?

A. No.